SemanticScuttle - klotz.me » Tags: training+deep learning

Tags: training* + deep learning*


Grokfast: Accelerated Grokking by Amplifying Slow Gradients

This paper presents a method to accelerate grokking, the phenomenon in which a model's generalization improves only after many additional training iterations beyond an initial overfitting stage. The authors propose a simple modification to existing optimizers that treats the gradient sequence as a signal over time, filters out its fast-varying components, and amplifies its slow-varying components, thereby accelerating the grokking effect.

2024-08-19 Tags: grokking, deep learning, optimization techniques, gradient filtering, llm, training, eric hartford by klotz
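
The gradient filtering described in the Grokfast entry above can be illustrated with a minimal sketch. It assumes an exponential moving average (EMA) as the low-pass filter; the function name `ema_gradient_filter` and the parameters `alpha` (EMA momentum) and `lamb` (amplification factor) are illustrative choices, not taken from the bookmark text, and plain Python lists stand in for parameter tensors.

```python
def ema_gradient_filter(grad, ema, alpha=0.9, lamb=2.0):
    """Amplify the slow-varying part of a gradient (illustrative sketch).

    grad  : list of floats, the current raw gradient
    ema   : list of floats or None, running average of past gradients
    alpha : EMA momentum; closer to 1.0 keeps only slower components
    lamb  : how strongly the slow component is added back to the gradient
    Returns (filtered_grad, new_ema).
    """
    if ema is None:
        # First step: start the running average from the current gradient.
        new_ema = list(grad)
    else:
        # Low-pass filter: the EMA tracks the slow-varying component.
        new_ema = [alpha * m + (1.0 - alpha) * g for m, g in zip(ema, grad)]
    # Add the amplified slow component on top of the raw gradient.
    filtered = [g + lamb * m for g, m in zip(grad, new_ema)]
    return filtered, new_ema


# Usage: the first coordinate is steady, the second flips sign each step.
state = None
for raw in ([1.0, -2.0], [1.0, 2.0]):
    filtered, state = ema_gradient_filter(raw, state, alpha=0.5, lamb=2.0)
    print(filtered)
```

In this toy run the steady coordinate is boosted on both steps, while the oscillating coordinate averages out in the EMA and receives no boost on the second step. In a real training loop the filtered gradient would be handed to the existing optimizer in place of the raw one.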
How to train your large language model: A new technique speeds up the process

This article discusses the process of training a large language model (LLM) using reinforcement learning from human feedback (RLHF) and a new alternative method called Direct Preference Optimization (DPO). The article explains how these methods help align the LLM with human expectations and make it more efficient.

2024-05-15 Tags: llm, reinforcement learning, human feedback, openai, chatgpt, rlhf, dpo, training by klotz
Mastering LLM Techniques: Training

Delving into transformer networks

2023-11-18 Tags: nvidia, llm, training, transformers, deep learning by klotz
[1803.03635] The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

2020-05-02 Tags: deep learning, training, lottery ticket, optimization, mit by klotz
[keras.io] Image classification from scratch.ipynb - Colaboratory

2020-04-29 Tags: image processing, classification, deep learning, training, keras, google collab, tutorial, python by klotz
What Machine Learning needs from Hardware – Pete Warden's blog

2019-04-14 Tags: deep learning, inference, training, hardware by klotz
“GluonNLP 0.6: Closing the Gap in Reproducible Research with BERT”

2019-03-20 Tags: bert, nlp, deep learning, training by klotz
Neural Networks - Hyperparameter Tuning, Regularization & Optimization

2018-11-13 Tags: neural network, training, machine learning, overfitting, fitting, bias by klotz
“Why Visualize Gradient Descent Optimization Algorithms ?”

2018-09-09 Tags: deep learning, training, sgd by klotz
Data For Everyone

2018-07-18 Tags: data, training, nlp, image, deep learning by klotz
